[WIP] Make WebGPU EP compatible with EP API #26907
fs-eire wants to merge 3 commits into microsoft:main
Pull request overview
This PR extends WebGPU EP to support building as both a bundled EP (static library) and an EP API-based plugin EP (shared library). The changes introduce a new adapter infrastructure in include/onnxruntime/ep/ that bridges C-API objects to simulate ORT internal class behaviors, enabling compile-time switching between build modes with minimal code changes.
Key changes:
- New EP adapter infrastructure with header-only wrapper classes for kernel info, context, registry, and EP interface
- WebGPU EP API implementation (api.cc, factory.cc/h, ep.cc/h) in onnxruntime/core/providers/webgpu/ep/
- Templated CPU tensor operator base classes to work with both native and adapter OpKernelInfo types
- Enhanced C API with KernelInfo_GetOperatorType, KernelInfo_GetSinceVersion, and KernelInfo_GetEp
- Test infrastructure updates including example kernel registry changes (Mul kernel renamed to BinaryOp supporting both Add and Mul)
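The compile-time switching mentioned in the overview could look roughly like the sketch below. It is only an illustration: the `native` and `adapter` namespaces, the `BUILD_AS_PLUGIN_EP` macro, and `DescribeKernelInfo` are hypothetical names, not the types or flags this PR actually introduces.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins; neither type mirrors the real ORT classes.
namespace native {
struct OpKernelInfo {
  std::string Source() const { return "native"; }
};
}  // namespace native

namespace adapter {
struct OpKernelInfo {
  std::string Source() const { return "adapter"; }
};
}  // namespace adapter

// Hypothetical build flag: defining BUILD_AS_PLUGIN_EP selects the adapter types.
#if defined(BUILD_AS_PLUGIN_EP)
using OpKernelInfo = adapter::OpKernelInfo;
#else
using OpKernelInfo = native::OpKernelInfo;
#endif

// Kernel code is written once against the alias and does not name the build mode.
inline std::string DescribeKernelInfo(const OpKernelInfo& info) {
  return info.Source();
}

// Checks at compile time that the alias resolved to the type the flag asks for.
inline void SelfCheck() {
#if defined(BUILD_AS_PLUGIN_EP)
  static_assert(std::is_same_v<OpKernelInfo, adapter::OpKernelInfo>);
#else
  static_assert(std::is_same_v<OpKernelInfo, native::OpKernelInfo>);
#endif
  assert(!DescribeKernelInfo(OpKernelInfo{}).empty());
}
```

The point of the alias is that existing kernel sources keep referring to one name, so only the header that defines the alias needs to know which build is in progress.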
Reviewed changes
Copilot reviewed 75 out of 75 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| include/onnxruntime/ep/*.h | New EP adapter infrastructure headers providing C-API wrappers for kernel registration and execution |
| onnxruntime/core/providers/webgpu/ep/* | WebGPU EP API implementation for factory, EP instance, and plugin entry points |
| onnxruntime/core/providers/cpu/tensor/*.h | Templated base classes for tensor ops to support both native and adapter kernel info types |
| onnxruntime/test/autoep/* | Test updates including BinaryOp generalization and additional test coverage |
| onnxruntime/core/session/*.cc | Core API additions for kernel info operator type, version, and EP retrieval |
| InlinedVector<float> scales_array(input_dims.size()); | ||
| // opset < 10 | ||
| if (OpKernel::Node().InputDefs().size() == 1) { | ||
| if (OpKernel::Node().SinceVersion() < 10) { |
This check should verify info.node().SinceVersion() < 10 to be consistent with the comment. Checking input count may not be reliable as the input count could vary for other reasons. Use the opset version directly for clarity and correctness.
| if constexpr (std::is_same_v<KernelInfoType, onnxruntime::OpKernelInfo>) { | ||
| auto x_shape = node.InputDefs()[0]->Shape(); | ||
| if (x_shape != nullptr) { | ||
| rank = x_shape->dim_size(); | ||
| } | ||
| } else { | ||
| int is_const; | ||
| auto tensor = info.GetKernelInfo().GetTensorConstantInput(0, &is_const); | ||
| if (is_const) { | ||
| auto type_and_shape_info = tensor.GetTensorTypeAndShapeInfo(); | ||
| if (type_and_shape_info.HasShape()) { | ||
| rank = static_cast<int64_t>(type_and_shape_info.GetShape().size()); | ||
| } | ||
| } | ||
| } |
The constexpr branch at line 223 checks for std::is_same_v<KernelInfoType, onnxruntime::OpKernelInfo> but at line 229, it tries to call methods on info.GetKernelInfo() which suggests that info is expected to be of type onnxruntime::ep::adapter::OpKernelInfo. However, if that's the case, the else branch would execute, and info.GetKernelInfo() would return an Ort::ConstKernelInfo, not an object with a GetTensorConstantInput method. This logic appears inconsistent and may cause compilation or runtime errors.
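For readers unfamiliar with the pattern under discussion, here is a minimal sketch of `if constexpr` dispatch over two kernel-info flavours. `NativeInfo`, `AdapterInfo`, and `GetInputRank` are hypothetical stand-ins and do not reproduce the ORT or adapter classes; the sketch only shows that each branch is compiled solely for the type it names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the two kernel-info flavours.
struct NativeInfo {
  const std::vector<int64_t>* shape = nullptr;  // null when the shape is unknown
};
struct AdapterInfo {
  bool has_shape = false;
  std::vector<int64_t> dims;
};

// Returns the input rank when it is known, std::nullopt otherwise.
template <typename KernelInfoType>
std::optional<int64_t> GetInputRank(const KernelInfoType& info) {
  if constexpr (std::is_same_v<KernelInfoType, NativeInfo>) {
    // Only instantiated for NativeInfo.
    if (info.shape != nullptr) {
      return static_cast<int64_t>(info.shape->size());
    }
  } else {
    // Only instantiated for other types; reject anything but AdapterInfo.
    static_assert(std::is_same_v<KernelInfoType, AdapterInfo>,
                  "unsupported kernel info type");
    if (info.has_shape) {
      return static_cast<int64_t>(info.dims.size());
    }
  }
  return std::nullopt;
}
```

Because the discarded branch is never instantiated, each branch may call members that exist only on its own type, which is what makes a single templated base class usable in both build modes.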
| /// </summary> | ||
| inline std::unique_ptr<onnxruntime::Tensor> CreateTensorFromApiValue(const OrtValue* ort_value) { | ||
| Ort::ConstValue value{ort_value}; | ||
| EP_ENFORCE(value.IsTensor(), "Only tensor OrtValue is supported."); |
The macro ORT_ENFORCE depends on exception handling, but when building as a plugin EP, exceptions may not be available or may not behave as expected across DLL boundaries. The code should use return status patterns instead for error handling in the plugin EP build path.
| EP_ENFORCE(value.IsTensor(), "Only tensor OrtValue is supported."); | |
| if (!value.IsTensor()) { | |
| // In plugin EP builds, avoid exception-based enforcement and signal failure via nullptr. | |
| return nullptr; | |
| } |
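As an alternative to signalling failure with a bare nullptr, the status-return pattern the comment refers to can be sketched as follows. The `Status`, `Value`, and `Tensor` types and the `CreateTensorFromValue` function here are hypothetical simplifications, not the ORT types.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical types; not the ORT Status, OrtValue, or Tensor classes.
struct Status {
  bool ok = true;
  std::string message;
};
struct Value {
  bool is_tensor = false;
  int payload = 0;
};
struct Tensor {
  int payload = 0;
};

// Reports failure through the returned Status instead of throwing.
// `out` is only written on success.
inline Status CreateTensorFromValue(const Value& value, std::unique_ptr<Tensor>& out) {
  if (!value.is_tensor) {
    return Status{false, "Only tensor values are supported."};
  }
  out = std::make_unique<Tensor>(Tensor{value.payload});
  return Status{};
}
```

Compared with returning nullptr, a status object keeps the error message that the enforce macro would otherwise have carried.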
| *allocator = new onnxruntime::ep::adapter::Allocator(factory->config_.initializer_allocator); | ||
| } else { | ||
| *allocator = new onnxruntime::ep::adapter::Allocator(factory->config_.device_allocator); | ||
| } |
Memory leak: The allocated Allocator instance is never deleted. The caller receives a raw pointer without any indication of ownership. Consider documenting the ownership semantics or using RAII patterns.
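One way to make the ownership contract explicit is to pair the raw-pointer factory with a release function and wrap both on the caller side. The sketch below assumes a hypothetical `Allocator` type and hypothetical `CreateAllocator` / `ReleaseAllocator` functions; it does not describe how the EP API actually releases allocators.

```cpp
#include <cassert>
#include <memory>

// Hypothetical allocator with an instance counter so that leaks are observable.
struct Allocator {
  static inline int live_count = 0;
  Allocator() { ++live_count; }
  ~Allocator() { --live_count; }
  Allocator(const Allocator&) = delete;
  Allocator& operator=(const Allocator&) = delete;
};

// C-style factory: ownership of *out passes to the caller,
// who must hand it back through ReleaseAllocator.
inline void CreateAllocator(Allocator** out) {
  auto owned = std::make_unique<Allocator>();
  *out = owned.release();
}

inline void ReleaseAllocator(Allocator* allocator) { delete allocator; }

// Caller-side RAII wrapper that pairs the two calls.
struct AllocatorDeleter {
  void operator()(Allocator* a) const { ReleaseAllocator(a); }
};
using AllocatorPtr = std::unique_ptr<Allocator, AllocatorDeleter>;

inline AllocatorPtr MakeAllocator() {
  Allocator* raw = nullptr;
  CreateAllocator(&raw);
  return AllocatorPtr(raw);
}
```

With this arrangement the raw `new` is not a leak as long as every created allocator is eventually passed to the matching release call, which the wrapper guarantees on scope exit.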
| if (src_device_id != dst_device_id) { | ||
| int src_device_id = impl.ep_api.MemoryDevice_GetDeviceId(src_memory_device); | ||
| int dst_device_id = impl.ep_api.MemoryDevice_GetDeviceId(dst_memory_device); | ||
| if (src_device_id != impl.context_id_ || dst_device_id != impl.context_id_) { |
The device ID check logic appears incorrect. The condition checks if either src_device_id != context_id_ OR dst_device_id != context_id_, which would fail even when copying from the same device. This should use AND (&&) instead of OR (||) to ensure both source and destination match the context.
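The two candidate conditions differ only when exactly one device ID matches the context. A small sketch, using hypothetical function names and plain integers for device IDs, makes the truth table explicit:

```cpp
#include <cassert>

// Hypothetical helper mirroring the `||` form: reject when either endpoint
// is not on the context's device.
constexpr bool RejectIfEitherDiffers(int src, int dst, int ctx) {
  return src != ctx || dst != ctx;
}

// Hypothetical helper mirroring the `&&` form: reject only when both
// endpoints are off the context's device.
constexpr bool RejectIfBothDiffer(int src, int dst, int ctx) {
  return src != ctx && dst != ctx;
}

// Both endpoints on the context device: neither form rejects.
static_assert(!RejectIfEitherDiffers(0, 0, 0));
static_assert(!RejectIfBothDiffer(0, 0, 0));

// Only the source matches: the `||` form rejects, the `&&` form accepts.
static_assert(RejectIfEitherDiffers(0, 1, 0));
static_assert(!RejectIfBothDiffer(0, 1, 0));

// Neither endpoint matches: both forms reject.
static_assert(RejectIfEitherDiffers(1, 1, 0));
static_assert(RejectIfBothDiffer(1, 1, 0));
```

By De Morgan's law, the `||` form accepts exactly the copies where source and destination both match the context, so the choice between the two forms comes down to whether a copy with only one matching endpoint should be accepted.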
Description
This PR makes it possible to build WebGPU EP as an EP API based plugin EP.
Requirements
The goal of this PR is to support building WebGPU EP both as a bundled EP and as an EP API based plugin EP. This approach allows:
Design & Implementation
Instead of changing WebGPU EP from a bundled EP to an EP API based plugin EP in one shot, this PR extends WebGPU EP to support building as a plugin EP.
- add a new folder include/onnxruntime/ep with a bunch of header files. Those files are not WebGPU specific. They are used for: onnxruntime::ep::Ep to inherit from

  These header files allow a compile-time "switch" to a different set of types to minimize changes to existing code. Specifically, pch.h must be included as the PCH to make sure the "override" takes place correctly.
- add a new folder onnxruntime/core/providers/webgpu/ep for the EP API implementation, specifically:
  - api.cc: implements CreateEpFactories and ReleaseEpFactory
  - ep.cc, ep.h: implement class onnxruntime::webgpu::ep::Ep
  - factory.cc, factory.h: implement class onnxruntime::webgpu::ep::Factory

Dependencies and Prerequisites
(unmerged changes are included as a part of current PR)
Missing Parts